[Day 17] Oops! Golang - Let's Catch the Culprit Eating Our Resources! - iT 邦幫忙

12th iThome Ironman Contest

DAY 17

DevOps

Stay Away from DevOops series, part 17

Tags: 12th Ironman Contest, golang, pprof, flame graph

rainforest

Team: 神龍特攻隊-為了燒肉不小心成為一條龍

2020-09-25 10:02:25

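Have you ever seen a resource-usage graph that never comes back down?

Have you ever had a machine suddenly run out of resources?

If reading the code doesn't reveal the cause, it's time for performance profiling.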

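When it comes to profiling Go programs, pprof is the first tool that comes to mind.
This article walks through the basics of Go's pprof package:
https://golang.org/pkg/net/http/pprof/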

The main profiles it provides are:

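cpu: the CPU profile shows where the program spends its CPU time.
heap: a sample of memory allocations, covering both memory currently in use and memory allocated in the past; useful for tracking down memory leaks.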

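The heap holds dynamically allocated memory whose lifetime is irregular and hard to predict. In Go this memory is reclaimed by the garbage collector rather than freed by hand, but the GC can only reclaim what is no longer referenced.
So if the program keeps holding on to objects it no longer needs, you will see the process eating more and more memory.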
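To make the point about retained memory concrete, here is a minimal sketch. The names `retained`, `retain`, and `heapProfileSize` are hypothetical and exist only for illustration; the sketch assumes that a package-level slice is a fair stand-in for "references the program forgot to drop". It writes a heap profile into memory with `pprof.WriteHeapProfile` from `runtime/pprof` instead of going through the HTTP endpoint.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
)

// retained is a package-level slice (hypothetical name). Everything appended
// to it stays reachable, so the garbage collector cannot reclaim it.
var retained [][]byte

// retain allocates n buffers of 1 KiB each and keeps a reference to all of them.
func retain(n int) {
	for i := 0; i < n; i++ {
		retained = append(retained, make([]byte, 1024))
	}
}

// heapProfileSize writes the current heap profile into memory and returns
// how many bytes the encoded profile takes.
func heapProfileSize() (int, error) {
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return 0, err
	}
	return buf.Len(), nil
}

func main() {
	retain(1000)
	size, err := heapProfileSize()
	if err != nil {
		fmt.Println("writing heap profile failed:", err)
		return
	}
	fmt.Printf("holding %d buffers; heap profile is %d bytes\n", len(retained), size)
}
```

In a real leak the retaining reference is usually less obvious (a cache, a map that is never pruned, a goroutine that never exits), which is why the inuse_space view of the heap profile is helpful.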

threadcreate: stack traces that led to the creation of new OS threads.

goroutine: the goroutine profile reports the stack traces of all current goroutines.

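The stack is where value types (primitives) are stored, and it works in LIFO (last in, first out) order. In Go, the compiler's escape analysis decides whether a given value actually lives on the stack or on the heap. The stack used to store values and the run-time call stack work the same way. A run-time stack frame contains:
Parameters: the function's arguments
Return address: where execution resumes once the function has finished
Local variables: the function's local variables
Source: Stack vs. Heap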

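block: the block profile shows where goroutines block while waiting (including on timer channels). It is disabled by default; use runtime.SetBlockProfileRate to enable it.
mutex: the mutex profile reports lock contention. Use it when you suspect the CPU is not fully utilized because of mutex contention. It is disabled by default; use runtime.SetMutexProfileFraction to enable it.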
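Since the block and mutex profiles are off by default, a short sketch of switching them on may help. The helper names `enableContentionProfiles` and `currentMutexFraction` are hypothetical, and the rates 1 and 5 are illustrative assumptions rather than recommended values: a block rate of 1 records every blocking event, which is likely too costly for production.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/pprof"
)

// enableContentionProfiles turns on the block and mutex profiles.
// blockRate is passed to runtime.SetBlockProfileRate (1 records every
// blocking event, 0 or less disables the profile). mutexFraction is passed
// to runtime.SetMutexProfileFraction (on average 1 in mutexFraction
// contention events is reported, 0 disables the profile).
func enableContentionProfiles(blockRate, mutexFraction int) {
	runtime.SetBlockProfileRate(blockRate)
	runtime.SetMutexProfileFraction(mutexFraction)
}

// currentMutexFraction reads the current setting without changing it:
// a negative argument makes SetMutexProfileFraction return the current rate.
func currentMutexFraction() int {
	return runtime.SetMutexProfileFraction(-1)
}

func main() {
	enableContentionProfiles(1, 5) // illustrative values only
	fmt.Println("mutex profile fraction:", currentMutexFraction())

	// Both profiles are always registered; the rates only control sampling.
	for _, name := range []string{"block", "mutex"} {
		fmt.Println(name, "profile registered:", pprof.Lookup(name) != nil)
	}
}
```

Calling these setters early in `main`, before the HTTP server starts, means the `/debug/pprof/block` and `/debug/pprof/mutex` endpoints have data to show.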

First, let's see how to use it.

Expose an HTTP server from your code:

http.ListenAndServe("localhost:6060", nil)

example

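package main

import (
	"fmt"
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/ handlers on http.DefaultServeMux
)

func main() {
	_, err := fmt.Println("Hello, ithome")
	if err == nil {
		gorace()
	}
}

func gorace() {
	c := make(chan bool)
	m := make(map[string]string)
	// The two map writes below are unsynchronized (a data race);
	// the pprof endpoints do not depend on them.
	go func() {
		m["1"] = "a" // First conflicting access.
		c <- true
	}()
	m["2"] = "b" // Second conflicting access.
	<-c
	for k, v := range m {
		fmt.Println(k, v)
	}

	// A nil handler means http.DefaultServeMux, which now serves pprof.
	// ListenAndServe blocks and only returns on error, so report that error.
	log.Fatal(http.ListenAndServe("localhost:6060", nil))
}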
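The example above relies on the blank import to register the handlers on `http.DefaultServeMux`. As an alternative, here is a sketch that registers the same handlers on a dedicated mux, which keeps the debug endpoints separate from the application's own routes. `newDebugMux` and `statusFor` are hypothetical helper names, and `httptest` is used only so the sketch can check the routes without opening a port.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// newDebugMux returns a mux that serves only the pprof endpoints.
// pprof.Index also serves the named profiles such as heap and goroutine.
func newDebugMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	return mux
}

// statusFor sends an in-memory GET request to the mux and returns the
// HTTP status code, without starting a real server.
func statusFor(mux *http.ServeMux, path string) int {
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	mux := newDebugMux()
	for _, path := range []string{"/debug/pprof/", "/debug/pprof/heap", "/"} {
		fmt.Println(path, "->", statusFor(mux, path))
	}
	// To serve it for real, one option is:
	//   log.Fatal(http.ListenAndServe("localhost:6060", mux))
}
```

Binding to `localhost`, as the original example does, keeps these endpoints off the public network interface.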

Open http://localhost:6060/debug/pprof/

A simple web page lets you browse the reports.

To view them as charts:

go tool pprof http://localhost:6060/debug/pprof/heap

This produces a file, reported on the line "Saved profile in XXXX":

Fetching profile over HTTP from http://localhost:6060/debug/pprof/heap
Saved profile in /Users/$mypc/pprof/pprof.alloc_objects.alloc_space.inuse_objects.inuse_space.001.pb.gz
Type: inuse_space
Time: Sep 23, 2020 at 4:50pm (CST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof)
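The fetch-and-save step in the output above can also be done from Go, for example when profiles should be archived by a script. The following is only a sketch of that idea, not how `go tool pprof` is implemented: `fetchProfile` and `fetchFromLocalTestServer` are hypothetical names, and the sketch spins up a throwaway `httptest` server so it does not depend on anything listening on port 6060.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	_ "net/http/pprof" // registers /debug/pprof/ on http.DefaultServeMux
	"os"
	"path/filepath"
)

// fetchProfile downloads a profile from url, stores it at dest, and
// returns the number of bytes written.
func fetchProfile(url, dest string) (int64, error) {
	resp, err := http.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return 0, fmt.Errorf("unexpected status %s", resp.Status)
	}

	f, err := os.Create(dest)
	if err != nil {
		return 0, err
	}
	defer f.Close()
	return io.Copy(f, resp.Body)
}

// fetchFromLocalTestServer serves the pprof handlers from a temporary
// in-process server and saves its heap profile into a temporary directory.
func fetchFromLocalTestServer() (int64, error) {
	srv := httptest.NewServer(http.DefaultServeMux)
	defer srv.Close()

	dir, err := os.MkdirTemp("", "pprof-sketch")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	return fetchProfile(srv.URL+"/debug/pprof/heap", filepath.Join(dir, "heap.pb.gz"))
}

func main() {
	n, err := fetchFromLocalTestServer()
	if err != nil {
		fmt.Println("fetch failed:", err)
		return
	}
	fmt.Printf("saved heap profile (%d bytes)\n", n)
}
```

Against the running example, the call would be `fetchProfile("http://localhost:6060/debug/pprof/heap", "heap.pb.gz")`, and the saved file can be opened with `go tool pprof` like the one shown above.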

Open the file with the pprof web UI:

go tool pprof -http=:8080 /Users/$mypc/pprof/pprof.alloc_objects.alloc_space.inuse_objects.inuse_space.001.pb.gz

Serving web UI on http://localhost:8080

Open http://localhost:8080 in a browser.

Views shown in the original screenshots: Graph, Flame Graph, Top, Source.

Oops - Profiling Has a Performance Cost

Can I profile my production services?

Yes. It is safe to profile programs in production, but enabling some profiles (e.g. the CPU profile) adds cost. You should expect to see performance downgrade. The performance penalty can be estimated by measuring the overhead of the profiler before turning it on in production.

You may want to periodically profile your production services. Especially in a system with many replicas of a single process, selecting a random replica periodically is a safe option. Select a production process, profile it for X seconds for every Y seconds and save the results for visualization and analysis; then repeat periodically. Results may be manually and/or automatically reviewed to find problems. Collection of profiles can interfere with each other, so it is recommended to collect only a single profile at a time.

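If you enable pprof and its related settings, be careful! As the documentation quoted above says, collect only one profile at a time, because profiles collected simultaneously can interfere with each other.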
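The "profile for X seconds every Y seconds" advice can be sketched with `runtime/pprof`. This is a simplified illustration, not a production collector: `captureCPUProfile` and `profilePeriodically` are hypothetical names, the durations in `main` are arbitrary, and the profiles are kept in memory rather than saved for later analysis. Note that `pprof.StartCPUProfile` returns an error if a CPU profile is already running, which enforces the one-at-a-time rule for this profile type.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"time"
)

// captureCPUProfile records a CPU profile for the given window and returns
// the encoded profile. It fails if another CPU profile is already running.
func captureCPUProfile(window time.Duration) ([]byte, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return nil, err
	}
	time.Sleep(window) // the program's normal work continues on other goroutines
	pprof.StopCPUProfile()
	return buf.Bytes(), nil
}

// profilePeriodically captures one CPU profile per round, pausing for
// pause between rounds, and returns the size of each captured profile.
func profilePeriodically(window, pause time.Duration, rounds int) ([]int, error) {
	sizes := make([]int, 0, rounds)
	for i := 0; i < rounds; i++ {
		data, err := captureCPUProfile(window)
		if err != nil {
			return sizes, err
		}
		sizes = append(sizes, len(data))
		if i < rounds-1 {
			time.Sleep(pause)
		}
	}
	return sizes, nil
}

func main() {
	// Arbitrary short durations so the sketch finishes quickly.
	sizes, err := profilePeriodically(100*time.Millisecond, 50*time.Millisecond, 3)
	if err != nil {
		fmt.Println("profiling failed:", err)
		return
	}
	fmt.Println("captured profiles:", len(sizes))
}
```

In a real service each captured profile would be written to disk or shipped to storage, and the window and pause would be tuned after measuring the profiler's overhead, as the quoted documentation suggests.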